
Conversation


@basnijholt basnijholt commented Aug 13, 2025

Problem

The flake.nix includes references to llama-cpp.cachix.org with a comment claiming it's "Populated by the CI in ggml-org/llama.cpp", but this is misleading and caused confusion during setup.
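For context, the block in question can be located directly in the flake (an illustrative command, not part of this change; the substituter entries are assumed to live in the flake's nixConfig attribute, and the exact layout may differ):

    # show the cachix-related lines in flake.nix, with a little surrounding context
    $ grep -n -B 2 -A 6 cachix flake.nix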

Personal Experience

I spent over an hour debugging why llama.cpp builds were taking so long, assuming the cache was functional based on the documentation. Only after extensive investigation did I realize the cache is not being populated.

Evidence the cache is non-functional:

  1. No CI workflow exists: a search of .github/workflows/ turned up no workflow that pushes to the cachix cache

  2. Cache is accessible but empty for recent builds:

    $ nix store info --store https://llama-cpp.cachix.org
    Store URL: https://llama-cpp.cachix.org  # ← accessible
    
    $ nix build github:ggml-org/llama.cpp#cuda --dry-run
    these 5 derivations will be built:  # ← llama-cpp must be built locally
      /nix/store/2jqbcad4ig5indz3j88mjf05sjdhk9sc-cuda_nvcc-12.4.99.drv
      /nix/store/3q7gy8qnivrsbwkp0a7980475klv8hr0-cuda_cccl-12.4.99.drv
      /nix/store/mhk9108dvvlafr6p5xs55wa0k2ymanry-cuda_cudart-12.4.99.drv
      /nix/store/q2lqgcsnwsp765jqck7walgw47q5g4b0-libcublas-12.4.2.65.drv
      /nix/store/78lqkcz6n4fc3ck5mq7v2pry0qv81iys-llama-cpp-cuda-0.0.0.drv  # ← NOT cached
    these 155 paths will be fetched (704.42 MiB download, 1403.71 MiB unpacked)

    Note: Dependencies are cached (155 paths, 704 MiB), but llama.cpp itself is not; a direct per-path check is sketched below.
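A more direct check is to resolve the flake output's store path without building it and ask the cache for that exact path. This is a sketch, assuming the #cuda output evaluates on your machine:

    # evaluate the output path of the cuda package without building it
    $ out=$(nix eval --raw github:ggml-org/llama.cpp#cuda.outPath)
    # query the binary cache for that exact path; this errors out if the
    # binary was never pushed to the cache
    $ nix path-info --store https://llama-cpp.cachix.org "$out"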

Solution

Remove the non-functional cache references entirely, leaving only the working cuda-maintainers.cachix.org cache that actually provides CUDA dependencies.

This prevents other users from going through the same debugging process I did. The cache can always be re-added later if CI gets set up to populate it.
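For anyone who wants to opt back in once the cache is actually populated, this can be done per invocation without touching the flake. A sketch; the public-key placeholder below must be replaced with the real key published on the cachix page:

    # enable the cache for a single build only
    $ nix build github:ggml-org/llama.cpp#cuda \
        --option extra-substituters https://llama-cpp.cachix.org \
        --option extra-trusted-public-keys "llama-cpp.cachix.org-1:<public-key>"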

Testing

  • Verified that no GitHub Actions workflow populates the cache
  • Confirmed that llama-cpp builds are not available in the cache (must be built locally)
  • Tested cache accessibility (the cache responds, but is empty for llama.cpp binaries)
  • Confirmed that dependencies are properly cached via the cuda-maintainers cache
  • Documentation-only change, no functional impact
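The first check (no workflow pushes to the cache) can be reproduced with a simple search; this is illustrative, on the assumption that any CI step pushing to cachix would mention it by name:

    $ grep -ril cachix .github/workflows/ || echo "no workflow references cachix"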

The flake.nix included references to llama-cpp.cachix.org cache with a comment
claiming it's 'Populated by the CI in ggml-org/llama.cpp', but:

1. No visible CI workflow populates this cache
2. The cache is empty for recent builds (tested b6150, etc.)
3. This misleads users into expecting pre-built binaries that don't exist

This change removes the non-functional cache references entirely, leaving only
the working cuda-maintainers cache that actually provides CUDA dependencies.

Users can still manually add the llama-cpp cache if it becomes functional in the future.
@github-actions github-actions bot added the nix Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment label Aug 13, 2025
@ggerganov ggerganov requested a review from philiptaron August 13, 2025 18:18
@philiptaron philiptaron merged commit 1adc981 into ggml-org:master Aug 13, 2025
2 checks passed
the-phobos pushed a commit to the-phobos/llama.cpp that referenced this pull request Aug 14, 2025
…ggml-org#15295)

